InfoXtract Location Normalization: A Hybrid Approach To Geographic References In Information Extraction

Authors

  • Huifeng Li
  • Rohini K. Srihari
  • Cheng Niu
  • Wei Li
Abstract

Ambiguity is very high for location names. For example, there are 23 cities named ‘Buffalo’ in the U.S. Based on our previous work, this paper presents a refined hybrid approach to geographic references using our information extraction engine InfoXtract. The InfoXtract location normalization module consists of local pattern matching and discourse co-occurrence analysis as well as default senses. Multiple knowledge sources are used in a number of ways: (i) pattern matching driven by local context, (ii) maximum spanning tree search for discourse analysis, and (iii) applying default sense heuristics and extracting default senses from the web. The results are benchmarked with 96% accuracy on our test collections that consist of both news articles and tourist guides. The performance contribution for each component of the module is also benchmarked and discussed.
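The three knowledge sources named in the abstract can be pictured as a cascade. The following is only a minimal sketch under stated assumptions, not the InfoXtract implementation: the gazetteer, the default senses, and the function name `normalize_locations` are hypothetical toy data and names, and the paper's maximum spanning tree search is replaced here by a much simpler count of how many locally resolved names support each candidate sense.

```python
import re
from collections import Counter

# Hypothetical toy gazetteer: location name -> candidate states (assumption).
GAZETTEER = {
    "Buffalo": ["NY", "MN", "WY"],
    "Rochester": ["NY", "MN"],
    "Albany": ["NY", "GA"],
}

# Hypothetical default senses, used only when no context helps (assumption).
DEFAULT_SENSE = {"Buffalo": "NY", "Rochester": "NY", "Albany": "NY"}

# Local pattern: "City, ST" gives the sense directly.
LOCAL_PATTERN = re.compile(r"\b([A-Z][a-z]+), ([A-Z]{2})\b")


def normalize_locations(text):
    """Resolve each known location name in text to one candidate state."""
    resolved = {}

    # Stage 1: pattern matching driven by local context.
    for name, state in LOCAL_PATTERN.findall(text):
        if state in GAZETTEER.get(name, []):
            resolved[name] = state

    # Senses fixed by local patterns act as anchors for the discourse stage.
    anchors = Counter(resolved.values())

    for name, candidates in GAZETTEER.items():
        if name in resolved:
            continue
        if not re.search(rf"\b{re.escape(name)}\b", text):
            continue
        # Stage 2: simplified discourse co-occurrence. Prefer the candidate
        # sense shared with the most anchors (a stand-in for the MST search).
        best = max(candidates, key=lambda state: anchors[state])
        if anchors[best] > 0:
            resolved[name] = best
        else:
            # Stage 3: fall back to the default sense.
            resolved[name] = DEFAULT_SENSE[name]
    return resolved


text = "Flights from Rochester, MN were delayed. Buffalo also saw snow. Albany was clear."
result = normalize_locations(text)
```

In this toy run, "Rochester" is fixed by the local pattern, "Buffalo" follows it through co-occurrence, and "Albany" has no supporting context, so it takes its default sense.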

Similar resources

UB at TREC 12: HARD and Genomics Tracks

University at Buffalo (UB) participated in TREC-12 in Genomics and High Accuracy Retrieval from Documents (HARD) tracks. We explored some techniques that combine Information Retrieval and Information Extraction to perform the TREC tasks. We used an Information Extraction engine InfoXtract [3] from Cymfony Inc. to enhance retrieval results. For the Genomics primary task, documents retrieved usin...

A Hybrid Approach Based on Higher Order Spectra for Clinical Recognition of Seizure and Epilepsy Using Brain Activity

Introduction: This paper proposes a reliable and efficient technique to recognize different epilepsy states, including healthy, interictal, and ictal states, using Electroencephalogram (EEG) signals. Methods: The proposed approach consists of pre-processing, feature extraction by higher order spectra, feature normalization, feature selection by genetic algorithm and ranking method, and classif...

Location Normalization for Information Extraction

Ambiguity is very high for location names. For example, there are 23 cities named ‘Buffalo’ in the U.S. Country names such as ‘Canada’, ‘Brazil’ and ‘China’ are also city names in the USA. Almost every city has a Main Street or Broadway. Such ambiguity needs to be handled before we can refer to location names for visualization of related extracted events. This paper presents a hybrid approach f...

A geographic information system for gas power plant location using analytical hierarchy process and fuzzy logic

This study recommends a GIS-based (Geographic Information Systems) and multi-criteria evaluation for site selection of gas power plant in Natanz City of Iran. The multi-criteria decision framework integrates legal requirements and physical constraints related to environmental and economic concerns. It also builds a hierarchy model for gas power plant suitability. The methodologies used for site...

Publication date: 2003